Test Equating by Common Items and Common Subjects: Concepts and Applications. - Practical Assessment, Research & Evaluation
Abstract
Since the invention of z-scores (standardized scores), comparison among different tests has been widely conducted by test developers, instructors, educational researchers, and psychometricians. Equating, calibration, and moderation are terms used to describe broad levels of possible comparison among educational assessments (Dorans, 2004; Feuer, Holland, Green, Bertenthal, & Hemphill, 1999; Linn, 1993; Mislevy, 1992). Equating is at one end of the linking continuum, involving the most stringent requirements of equivalence among the assessments and examinee populations to be linked, and compares tests that measure the same construct and have been designed to be equivalent. Less equivalent conditions involve calibration, which compares tests that measure the same construct but vary in design or difficulty, and moderation, which compares tests that measure different constructs. Psychometric approaches to linking assessments include linear equating, equipercentile equating, and item response theory (IRT). This article is a practical guide to conducting IRT test equating in two different scenarios:
Similar resources
An Illustration of a Mantel-Haenszel Procedure to Flag Misbehaving Common Items in Test Equating - Practical Assessment, Research & Evaluation
In this study the Mantel-Haenszel procedure, widely used in studies for identifying differential item functioning, is proposed as an alternative to the delta-plot method and applied in a test-equating context for flagging common items that behave differentially across cohorts of examinees. The Mantel-Haenszel procedure has the advantage of conditioning on ability when making comparisons of perf...
Comparison of proficiency in an anesthesiology course across distinct medical student cohorts: psychometric approaches to test equating.
BACKGROUND: Examinations are necessary for assessment of student proficiency in medical education, but comparison of achievement across different cohorts in different tests is challenging. We applied psychometric test equating methods to compare student proficiency in two different examinations for a clinical anesthesiology course. METHODS: Each examination contained 50 multiple choice items an...
Examining the Impact of Drifted Polytomous Anchor Items on Test Characteristic Curve (TCC) Linking and IRT True Score Equating
In a common-item (anchor) equating design, the common items should be evaluated for item parameter drift. Drifted items are often ...
Investigating Content and Construct Representation of a Common-item Design When Creating a Vertically Scaled Test
According to the equating guidelines, a set of common items should be a mini version of the total test in terms of content and statistical representation (Kolen & Brennan, 2004). Differences between vertical scaling and equating would suggest that these guidelines may not apply to vertical scaling in the same way that they apply to equating. This study investigated how well the guideline of con...